Breaking SVM Complexity with Cross-Training

Authors

  • Gökhan H. Bakir
  • Léon Bottou
  • Jason Weston
Abstract

We propose to selectively remove examples from the training set using probabilistic estimates related to editing algorithms (Devijver and Kittler, 1982). This heuristic procedure aims at creating a separable distribution of training examples with minimal impact on the position of the decision boundary. It breaks the linear dependency between the number of SVs and the number of training examples, and sharply reduces the complexity of SVMs during both the training and prediction stages.
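The editing idea can be illustrated with a minimal numpy sketch of Wilson-style k-NN editing, the classical procedure discussed by Devijver and Kittler: drop every training example whose label disagrees with its neighbours' majority vote. The helper `knn_edit` is hypothetical and deliberately simpler than the probabilistic estimates the paper actually uses; it only shows how editing shrinks a noisy, non-separable training set.

```python
import numpy as np

def knn_edit(X, y, k=5):
    """Wilson-style k-NN editing (cf. Devijver and Kittler, 1982):
    drop every training example whose label (+1/-1) disagrees with
    the majority label of its k nearest neighbours."""
    keep = np.ones(len(X), dtype=bool)
    for i in range(len(X)):
        d = np.linalg.norm(X - X[i], axis=1)
        d[i] = np.inf                       # a point is not its own neighbour
        neighbours = np.argsort(d)[:k]
        majority = 1 if y[neighbours].sum() > 0 else -1
        keep[i] = (majority == y[i])
    return keep

rng = np.random.default_rng(0)
# two overlapping Gaussian clouds: a deliberately non-separable set
X = np.vstack([rng.normal(-1.0, 1.0, (100, 2)),
               rng.normal(+1.0, 1.0, (100, 2))])
y = np.array([-1] * 100 + [+1] * 100)

keep = knn_edit(X, y, k=5)          # edited set is smaller and cleaner
X_edited, y_edited = X[keep], y[keep]
```

Training an SVM on `X_edited` instead of `X` then yields fewer support vectors, since the examples most likely to violate the margin have been removed before training.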


Similar Articles

Breaking the curse of kernelization: budgeted stochastic gradient descent for large-scale SVM training

Online algorithms that process one example at a time are advantageous when dealing with very large data or with data streams. Stochastic Gradient Descent (SGD) is such an algorithm and it is an attractive choice for online Support Vector Machine (SVM) training due to its simplicity and effectiveness. When equipped with kernel functions, similarly to other SVM learning algorithms, SGD is suscept...
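The budget idea behind this line of work can be sketched as a kernelized Pegasos-style SGD that caps the number of support vectors and, on overflow, evicts the SV with the smallest |alpha|. This is a minimal illustration under assumed names (`budgeted_kernel_sgd`, `rbf`), not the paper's algorithm, which studies several budget-maintenance strategies (removal, projection, merging).

```python
import numpy as np

def rbf(a, b, gamma=1.0):
    return np.exp(-gamma * np.linalg.norm(a - b) ** 2)

def budgeted_kernel_sgd(X, y, budget=20, lam=0.01, epochs=2):
    """Kernelized Pegasos-style SGD with a hard budget on the number
    of support vectors: on overflow, evict the SV with the smallest
    |alpha| (the simplest removal strategy)."""
    sv, alpha = [], []
    t = 0
    for _ in range(epochs):
        for x, label in zip(X, y):
            t += 1
            eta = 1.0 / (lam * t)
            score = sum(a * rbf(s, x) for s, a in zip(sv, alpha))
            alpha = [(1.0 - eta * lam) * a for a in alpha]  # shrink step
            if label * score < 1.0:        # hinge-loss margin violation
                sv.append(x)
                alpha.append(eta * label)
            if len(sv) > budget:           # budget maintenance
                j = int(np.argmin(np.abs(alpha)))
                sv.pop(j)
                alpha.pop(j)
    return sv, alpha

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(-1.0, 0.5, (30, 2)),
               rng.normal(+1.0, 0.5, (30, 2))])
y = np.array([-1] * 30 + [+1] * 30)
sv, alpha = budgeted_kernel_sgd(X, y, budget=15)
```

The budget bounds both memory and the per-step prediction cost at O(budget) kernel evaluations, regardless of how many examples stream past.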


Improving Efficiency of SVM k-Fold Cross-Validation by Alpha Seeding

The k-fold cross-validation is commonly used to evaluate the effectiveness of SVMs with the selected hyper-parameters. It is known that SVM k-fold cross-validation is expensive, since it requires training k SVMs. However, little work has explored reusing the h-th SVM for training the (h+1)-th SVM to improve the efficiency of k-fold cross-validation. In this paper, we propose three algorithms ...


Prediction with the SVM Using Test Point Margins

Support vector machines (SVMs) carry out binary classification by constructing a maximal margin hyperplane between the two classes of observed (training) examples and then classifying test points according to the half-spaces in which they reside (irrespective of the distances that may exist between the test examples and the hyperplane). Cross-validation involves finding the one SVM model togeth...
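The two quantities involved can be shown in a toy linear sketch: the half-space rule classifies by the sign of f(x) = w·x + b alone, while the geometric margin |f(x)| / ‖w‖ measures how far the test point sits from the hyperplane. The weights `w`, `b` below are assumed values for illustration, not a trained model.

```python
import numpy as np

# Toy linear decision function f(x) = w.x + b with assumed weights.
w = np.array([2.0, -1.0])
b = 0.5

def predict(x):
    """Half-space rule: the sign of f(x), ignoring the distance."""
    return 1 if w @ x + b >= 0 else -1

def margin(x):
    """Geometric margin: distance of the test point to the hyperplane."""
    return abs(w @ x + b) / np.linalg.norm(w)

x = np.array([1.0, 1.0])
label, m = predict(x), margin(x)    # f(x) = 1.5, margin = 1.5 / sqrt(5)
```

Two test points can receive the same label while having very different margins; using the margin values during model selection is what the snippet above alludes to.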


Support vector machine classification for large datasets using decision tree and Fisher linear discriminant

The training of a support vector machine (SVM) has a time complexity between O(n²) and O(n³). Most training algorithms for SVM are not suitable for large data sets. Decision trees can simplify SVM training; however, the classification accuracy becomes lower when there are inseparable points. This paper introduces a novel method for SVM classification. A decision tree is used to detect low entropy ...


Parallel multitask cross validation for Support Vector Machine using GPU

The Support Vector Machine (SVM) is an efficient machine learning tool with high accuracy. However, to achieve the best accuracy, n-fold cross validation is commonly used to identify the best hyperparameters for the SVM. This becomes a weak point of the SVM due to the extremely long training time over the various hyperparameters of different kernel functions. In this p...




Publication date: 2004